ABSTRACT

We investigate the use of the Multiple Optimised Parameter Estimation and Data compression 
algorithm (MOPED) for data compression and faster evaluation of likelihood functions. Since 
MOPED only guarantees maintaining the Fisher matrix of the likelihood at a chosen point, 
multimodal and some degenerate distributions will present a problem. We present examples 
of scenarios in which MOPED does faithfully represent the true likelihood but also cases 
in which it does not. Through these examples, we aim to define a set of criteria for which 
MOPED will accurately represent the likelihood and hence may be used to obtain a significant 
reduction in the time needed to calculate it. These criteria may involve the evaluation of the 
full likelihood function for comparison. 
1 INTRODUCTION

Multiple Optimised Parameter Estimation and Data compression (MOPED; Heavens et al.
2000) is a patented algorithm for the compression of data and the speeding up of the
evaluation of likelihood functions in astronomical data analysis and beyond. It becomes
particularly useful when the noise covariance matrix is dependent upon the parameters
of the model and so must be calculated and inverted at each likelihood evaluation.
However, such benefits come with limitations. Since MOPED only guarantees maintaining
the Fisher matrix of the likelihood at a chosen point, multimodal and some degenerate
distributions will present a problem. In this paper we report on some of the limitations
of the application of the MOPED algorithm. In the cases where MOPED does accurately
represent the likelihood function, however, its compression of the data and consequent
much faster likelihood evaluation does provide orders of magnitude improvement in
runtime. In Heavens et al. (2000), the authors demonstrate the method by analysing the
spectra of galaxies and in Gupta & Heavens (2002) they illustrate the benefits of MOPED
for estimation of the CMB power spectrum. The problem of "badly" behaved likelihoods
was found by Protopapas et al. (2005) for the problem of light transit analysis;
nonetheless, the authors present a solution that still allows MOPED to provide a large
speed increase.

We begin by introducing MOPED in Section 2 and define the original and MOPED likelihood
functions, along with comments on the potential speed benefits of MOPED. In Section 3
we introduce an astrophysical scenario where we found that MOPED did not accurately
portray the true likelihood function. In Section 4 we expand upon this scenario to
another where MOPED is found to work and to two other scenarios where it does not. We
present a discussion of the criteria under which we believe MOPED will accurately
represent the likelihood in Section 5, as well as a discussion of an implementation of
the solution provided by Protopapas et al. (2005).



2 DATA COMPRESSION WITH MOPED 

Full details of the MOPED method are given in Heavens et al. (2000); here we will only
present a limited introduction.

We begin by defining our data as a vector, $\mathbf{x}$. Our model describes
$\mathbf{x}$ by a signal plus random noise,

$$\mathbf{x} = \mathbf{u}(\theta_T) + \mathbf{n}(\theta_T), \qquad (1)$$

where the signal is given by a vector $\mathbf{u}(\theta)$ that is a function of the
set of parameters $\theta = \{\theta_i\}$ defining our model, and the true parameters
are given by $\theta_T$. The noise is assumed to be Gaussian with zero mean and noise
covariance matrix $\mathcal{N}_{jk} = \langle n_j n_k \rangle$, where the angle brackets
indicate an ensemble average over noise realisations (in general this matrix may also
be a function of the parameters $\theta$). The full likelihood for $N$ data points in
$\mathbf{x}$ is given by

$$\mathcal{L}(\theta) = \frac{1}{(2\pi)^{N/2}\sqrt{|\mathcal{N}(\theta)|}} \exp\left\{-\frac{1}{2}\left[\mathbf{x}-\mathbf{u}(\theta)\right]^{\mathrm{T}}\mathcal{N}(\theta)^{-1}\left[\mathbf{x}-\mathbf{u}(\theta)\right]\right\}. \qquad (2)$$

At each point, then, this requires the calculation of the determinant and inverse of an
$N \times N$ matrix. Both scale as $N^3$, so even for smaller datasets this can become
cumbersome.
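
To make the cost of Equation (2) concrete, the following is a minimal Python sketch, not the authors' code, that evaluates the full Gaussian log-likelihood with NumPy. The function name `full_log_likelihood` and its argument names are assumptions introduced here. A Cholesky factorisation supplies both the determinant and the quadratic form, and for a dense parameter-dependent covariance it is the $N^3$ step that must be repeated at every point in parameter space.

```python
import numpy as np


def full_log_likelihood(x, u, noise_cov):
    """Log of Eq. (2) for data x, model signal u and noise covariance matrix."""
    r = x - u
    # Cholesky factor C with noise_cov = C C^T; O(N^3) for a dense matrix.
    chol = np.linalg.cholesky(noise_cov)
    # Solve C z = r, so that r^T noise_cov^{-1} r = z^T z.
    z = np.linalg.solve(chol, r)
    # log|noise_cov| = 2 * sum(log(diag(C))).
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    n = x.size
    return -0.5 * (z @ z + log_det + n * np.log(2.0 * np.pi))
```

Working with the logarithm avoids underflow of the exponential for large residuals; only differences of log-likelihood matter for locating peaks.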

MOPED allows one to eliminate the need for this matrix inversion by compressing the $N$
data points in $\mathbf{x}$ into $M$ data values, one for each parameter of the model.
Additionally, MOPED creates the compressed data values such that they are independent
and have unit variance, further simplifying the likelihood calculation on them to an
$O(M)$ operation. Typically, $M \ll N$ so this gives us a significant increase in speed.
A single compression is done on the data, $\mathbf{x}$, and then again for each point in
parameter space where we wish to compute the likelihood. The compression is done by
generating a set of weighting vectors, $\mathbf{b}_i(\theta_F)$ ($i = 1 \ldots M$), from
which we can generate a set of MOPED components from the theoretical model and data,

$$y_i(\theta_F) \equiv \mathbf{b}_i(\theta_F) \cdot \mathbf{x} = \mathbf{b}_i(\theta_F)^{\mathrm{T}}\mathbf{x}. \qquad (3)$$



Note that the weighting vectors must be computed at some assumed fiducial set of
parameter values, $\theta_F$. The only choice that will truly maintain the likelihood
peak is when the fiducial parameters are the true parameters, but obviously we will not
know these in advance for real analysis situations. Thus, we can choose our fiducial
model to be anywhere and iterate the procedure, taking our likelihood peak in one
iteration as the fiducial model for the next iteration. This process will converge very
quickly, and may not even be necessary in some instances. For our later examples, since
we do know the true parameters we will use these as the fiducial ($\theta_F = \theta_T$)
in order to remove this as a source of confusion (all equations, however, are written
for the more general case). Note that the true parameters, $\theta_T$, will not
necessarily coincide with the peak $\theta_O$ of the original likelihood or the peak
$\theta_M$ of the MOPED likelihood (see below).

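The weighting vectors must be generated in some order so that each subsequent vector
(after the first) can be made orthogonal to all previous ones. We begin by writing the
derivative of the model with respect to the $i$th parameter, evaluated at the fiducial
point, as $\partial\mathbf{u}/\partial\theta_i = \mathbf{u}_{,i}(\theta_F)$. This gives
us a solution for the first weighting vector, properly normalised, of

$$\mathbf{b}_1(\theta_F) = \frac{\mathcal{N}(\theta_F)^{-1}\mathbf{u}_{,1}(\theta_F)}{\sqrt{\mathbf{u}_{,1}(\theta_F)^{\mathrm{T}}\mathcal{N}(\theta_F)^{-1}\mathbf{u}_{,1}(\theta_F)}}. \qquad (4)$$

The first compressed value is $y_1 = \mathbf{b}_1(\theta_F)^{\mathrm{T}}\mathbf{x}$ and
will weight up the data combination most sensitive to the first parameter. The
subsequent weighting vectors are made orthogonal by subtracting out parts that are
parallel to previous vectors, and are normalized. The resulting formula for the
remaining weighting vectors is

$$\mathbf{b}_m(\theta_F) = \frac{\mathcal{N}(\theta_F)^{-1}\mathbf{u}_{,m}(\theta_F) - \sum_{i=1}^{m-1}\left(\mathbf{u}_{,m}(\theta_F)^{\mathrm{T}}\mathbf{b}_i(\theta_F)\right)\mathbf{b}_i(\theta_F)}{\sqrt{\mathbf{u}_{,m}(\theta_F)^{\mathrm{T}}\mathcal{N}(\theta_F)^{-1}\mathbf{u}_{,m}(\theta_F) - \sum_{i=1}^{m-1}\left(\mathbf{u}_{,m}(\theta_F)^{\mathrm{T}}\mathbf{b}_i(\theta_F)\right)^2}}, \qquad (5)$$

where $m = 2 \ldots M$. Weighting vectors generated with Equations (4) and (5) form an
orthonormal set with respect to the noise covariance matrix so that

$$\mathbf{b}_i(\theta_F)^{\mathrm{T}}\,\mathcal{N}(\theta_F)\,\mathbf{b}_j(\theta_F) = \delta_{ij}. \qquad (6)$$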
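
The construction of the weighting vectors can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions rather than the reference MOPED code: the function name `moped_weights`, the array layout (one model derivative per row) and the random test inputs are all introduced here. It applies the Gram-Schmidt-like recursion for the first and subsequent weighting vectors and then checks numerically that the result is orthonormal with respect to the noise covariance matrix.

```python
import numpy as np


def moped_weights(derivs, noise_cov):
    """Return the M weighting vectors as rows of an (M, N) array.

    derivs    : (M, N) array; row m is the model derivative with respect to
                parameter m, evaluated at the fiducial parameters.
    noise_cov : (N, N) noise covariance matrix at the fiducial parameters.
    """
    # Rows of n_inv_d are noise_cov^{-1} applied to each derivative vector.
    n_inv_d = np.linalg.solve(noise_cov, derivs.T).T
    weights = []
    for m in range(derivs.shape[0]):
        d = derivs[m]
        numerator = n_inv_d[m].copy()
        norm_sq = d @ n_inv_d[m]
        # Subtract the components along the previously built vectors.
        for b_i in weights:
            proj = d @ b_i
            numerator -= proj * b_i
            norm_sq -= proj ** 2
        weights.append(numerator / np.sqrt(norm_sq))
    return np.array(weights)


# Usage with made-up inputs: 3 parameters, 40 data points.
rng = np.random.default_rng(0)
a = rng.normal(size=(40, 40))
cov = a @ a.T + 40.0 * np.eye(40)        # symmetric positive-definite
derivs = rng.normal(size=(3, 40))
b = moped_weights(derivs, cov)
gram = b @ cov @ b.T                      # should be close to the identity
```

The result depends on the order of the parameters, since each vector is orthogonalised only against those built before it.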



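This means that the noise covariance matrix of the compressed values $y_i$ is the
identity, which significantly simplifies the likelihood calculation. The new likelihood
function is given by

$$\mathcal{L}_{\mathrm{MOPED}}(\theta) = \frac{1}{(2\pi)^{M/2}} \exp\left\{-\frac{1}{2}\sum_{i=1}^{M}\left[y_i(\theta_F) - \langle y_i\rangle(\theta;\theta_F)\right]^2\right\}, \qquad (7)$$

where $y_i(\theta_F) = \mathbf{b}_i(\theta_F)^{\mathrm{T}}\mathbf{x}$ represents the
compressed data and
$\langle y_i\rangle(\theta;\theta_F) = \mathbf{b}_i(\theta_F)^{\mathrm{T}}\mathbf{u}(\theta)$
represents the compressed signal.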



This is a much easier likelihood to calculate and is time-limited by 
the generation of a new signal template instead of the inversion of 
the noise covariance matrix. The peak value of the MOPED likeli- 
hood function is not guaranteed to be unique as there may be other 
points in the original parameter space that map to the same point 
in the compressed parameter space; this is a characteristic that we 
will investigate. 
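
A minimal sketch of the compressed likelihood evaluation follows, assuming the weighting vectors are already available as the rows of an array `b`. The function names `compress` and `moped_log_likelihood` are assumptions made for illustration. The only per-point costs are the projection of the model template (of order $MN$) and the sum over $M$ squared differences.

```python
import numpy as np


def compress(b, v):
    """Project an N-vector v onto the M weighting vectors (rows of b)."""
    return b @ v


def moped_log_likelihood(y_data, b, u_theta):
    """Log of the MOPED likelihood for unit-variance, independent y_i."""
    y_model = compress(b, u_theta)            # compressed signal, O(MN)
    m = y_data.size
    chi2 = np.sum((y_data - y_model) ** 2)    # O(M)
    return -0.5 * chi2 - 0.5 * m * np.log(2.0 * np.pi)
```

In use, `y_data = compress(b, x)` is computed once for a given data set and fiducial model, and `moped_log_likelihood` is then called for each trial template `u_theta`. Any two templates with the same compressed values receive the same likelihood, which is the source of the non-uniqueness noted above.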

MOPED implicitly assumes that the covariance matrix, $\mathcal{N}$, is independent of
the parameters. With this assumption, a full likelihood calculation with $N$ data
points would require only an $O(N^2)$ operation at each point in parameter space (or
$O(N)$ if $\mathcal{N}$ is diagonal). In MOPED, however, the compression of the
theoretical data is an $O(MN)$ linear operation followed by an $O(M)$ likelihood
calculation. Thus, one loses on speed if $\mathcal{N}$ is diagonal but gains by a
factor of $N/M$ otherwise. For the data sets we will analyze, $N/M > 100$. We begin,
though, by assuming a diagonal $\mathcal{N}$ for simplicity, recognizing that this will
cause a speed reduction but that it is a necessary step before addressing a more
complex noise model. One can iterate the parameter estimation procedure if necessary,
taking the maximum likelihood point found as the new fiducial and re-analyzing (perhaps
with tighter prior constraints); this procedure is recommended for MOPED in Heavens et
al. (2000), but is not always found to be necessary. MOPED has the additional benefit
that the weighting vectors, $\mathbf{b}_i$, need only to be computed once provided the
fiducial model parameters are kept constant over the analysis of different data sets.
Computed compressed parameters, $\langle y_i\rangle$, can also be stored for re-use and
require less memory than storing the entire theoretical data set.
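
As a rough worked example of these operation counts, take $N = 2048$ and $M = 2$. These values are assumed here purely for illustration, chosen to resemble a 2048-sample segment and a two-parameter model, and constant factors are ignored:

```latex
\begin{align*}
\text{full likelihood, dense } \mathcal{N}: &\quad N^2 = 2048^2 \approx 4.2 \times 10^{6}, \\
\text{full likelihood, diagonal } \mathcal{N}: &\quad N = 2048, \\
\text{MOPED}: &\quad MN + M = 2 \times 2048 + 2 = 4098, \\
\text{gain for dense } \mathcal{N}: &\quad \frac{N^2}{MN + M} \approx 1.0 \times 10^{3} \approx \frac{N}{M}, \\
\text{loss for diagonal } \mathcal{N}: &\quad \frac{MN + M}{N} \approx 2 \approx M .
\end{align*}
```

Under these assumptions the dense case gains roughly three orders of magnitude per likelihood evaluation, while the diagonal case is slower by about a factor of $M$.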



3 SIMPLE EXAMPLE WITH ONE PARAMETER 

In order to demonstrate some of the limitations of the applicability 
of the MOPED algorithm, we will consider a few test cases. These 
originate in the context of gravitational wave data analysis for the 
Laser Interferometer Space Antenna (LISA) since it is in this sce- 
nario that we first discovered such cases of failure. The full problem 
is seven-dimensional parameter estimation, but we have fixed most 
of these variables to their known true values in the simulated data 
set in order to create a lower-dimensional problem that is simpler 
to analyse. 

We consider the case of a sine-Gaussian burst signal present in the LISA detector. The
short duration of the burst with respect to the motion of LISA allows us to use the
static approximation to the response. In frequency space, the waveform is described by
(Feroz et al. 2010)

$$\tilde{h}(f) = \frac{AQ}{f}\exp\left\{-\frac{1}{2}Q^2\left(\frac{f-f_c}{f_c}\right)^2\right\}\exp(2\pi i t_0 f). \qquad (8)$$



Here $A$ is the dimensionless amplitude factor; $Q$ is the width of the Gaussian
envelope of the burst measured in cycles; $f_c$ is the central frequency of the
oscillation being modulated by the Gaussian envelope; and $t_0$ is the central time of
arrival of the burst. This waveform is further modulated by the sky position of the
burst source, $\theta$ and $\phi$, and the burst polarisation, $\psi$, as they project
onto the detector.

Figure 1. The original and MOPED log-likelihoods as a function of $f_c$ for the chosen
template.

Figure 2. The value of the MOPED compressed parameter as a function of the original
frequency parameter.

The one-sided noise power spectral density of the LISA detector is given by (Feroz et
al. 2010)

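$$S_h(f) = 16\sin^2(2\pi f t_L) \times \left[2\left(1+\cos(2\pi f t_L)+\cos^2(2\pi f t_L)\right)S_{pm}(f) + \left(1+\cos(2\pi f t_L)/2\right)S_{sn}f^2\right], \qquad (9)$$

$$S_{pm}(f) = \left[1+\left(\frac{10^{-4}\,\mathrm{Hz}}{f}\right)^2\right]\frac{S_{acc}}{f^2}, \qquad (10)$$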



where $t_L = 16.678$ s is the light travel time along one arm of the LISA
constellation, $S_{acc} = 2.5 \times 10^{-48}\,\mathrm{Hz}^{-1}$ is the proof mass
acceleration noise and $S_{sn} = 1.8 \times 10^{-37}\,\mathrm{Hz}^{-1}$ is the shot
noise. This is independent of the signal parameters and all frequencies are independent
of each other, so the noise covariance matrix is constant and diagonal. This less
computationally expensive set-up allows us to show some interesting examples.

We begin by taking the one-dimensional case where the only unknown parameter of the
model is the central frequency of the oscillation, $f_c$. We set $Q = 5$ and
$t_0 = 10^5$ s; we then analyze a 2048 s segment of LISA data, beginning at
$t = 9.9 \times 10^4$ s, sampled at a 1 s cadence. For this example, the data was
generated with random noise (following the LISA noise power spectrum) at an SNR of
$\sim 34$ with $f_{c,T} = 0.1$ Hz (thus $f_{c,F} = 0.1$ Hz for MOPED). The prior range
on the central frequency goes from $10^{-3}$ Hz to 0.5 Hz. 10,000 samples uniformly
spaced in $f_c$ were taken and their likelihoods calculated using both the original and
MOPED likelihood functions. The log-likelihoods are shown in Figure 1. Note that the
absolute magnitudes are not important but the relative values within each plot are
meaningful. Both the original and MOPED likelihoods have a peak close to the input
value $f_{c,T}$.

We see, however, that in going from the original to MOPED log-likelihoods, the latter
also has a second peak of equal height at an incorrect $f_c$. To see where this peak
comes from, we look at the values of the compressed parameter
$\langle y_1\rangle(f_c; f_{c,F})$ as it varies with respect to $f_c$, as shown in
Figure 2. The true compressed value peak occurs at $f_c \simeq 0.1$ Hz, where
$y_1(f_{c,F}) = \langle y_1\rangle(f_c; f_{c,F})$. However, we see that there is another
frequency that yields this exact same value of $\langle y_1\rangle(f_c; f_{c,F})$; it is
at this frequency that the second, incorrect peak occurs. By creating a mapping from
$f_c$ to $\langle y_1\rangle(f_c; f_{c,F})$ that is not one-to-one, MOPED has created
the possibility for a second solution that is indistinguishable in likelihood from the
correct one. This is a very serious problem for parameter estimation.
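
The mechanism can be reproduced with a much simpler toy model than the LISA burst. The sketch below is an assumption-laden illustration, not the analysis performed in this paper: the template (a Gaussian bump whose amplitude equals its centre `theta`), the unit white noise, the noiseless data and all names are invented here. With one parameter there is a single weighting vector, and the scalar map from `theta` to the compressed signal rises through the true value and then falls back towards zero, so it takes the value of the compressed data a second time.

```python
import numpy as np

t = np.linspace(0.0, 10.0, 1001)       # sample times (toy units)
WIDTH = 0.5                             # fixed width of the toy bump


def signal(theta):
    """Toy template: Gaussian bump centred on theta with amplitude theta."""
    return theta * np.exp(-0.5 * ((t - theta) / WIDTH) ** 2)


def weight_vector(theta_f, eps=1e-6):
    """First MOPED weighting vector for unit white noise (identity covariance)."""
    deriv = (signal(theta_f + eps) - signal(theta_f - eps)) / (2.0 * eps)
    return deriv / np.sqrt(deriv @ deriv)


theta_true = 4.0
x = signal(theta_true)                  # noiseless data, to isolate the effect
b1 = weight_vector(theta_true)          # fiducial set to the truth
y1 = b1 @ x                             # compressed data value


def moped_log_like(theta):
    """MOPED log-likelihood up to an additive constant."""
    return -0.5 * (y1 - b1 @ signal(theta)) ** 2


def original_log_like(theta):
    """Original log-likelihood up to an additive constant."""
    r = x - signal(theta)
    return -0.5 * (r @ r)


# Bisection for a second solution of <y1>(theta) = y1, away from the truth.
lo, hi = 5.0, 7.0                       # <y1> is above y1 at lo, below at hi
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if b1 @ signal(mid) > y1:
        lo = mid
    else:
        hi = mid
theta_false = 0.5 * (lo + hi)

print(f"second solution at theta = {theta_false:.3f}")
print(f"MOPED log-likelihood there    = {moped_log_like(theta_false):.3e}")
print(f"original log-likelihood there = {original_log_like(theta_false):.3e}")
```

At the second solution the MOPED log-likelihood equals its value at the true parameter, while the original log-likelihood is far below its peak, so evaluating the original likelihood at each MOPED peak distinguishes the two.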



4 RECOVERY IN A 2 PARAMETER CASE 

Interestingly, we find that even when MOPED fails in a one-parameter case, adding a
second parameter may actually rectify the problem, although not necessarily. If we now
allow the width of the burst, $Q$, to be a variable parameter, there are now two
orthogonal MOPED weighting vectors that need to be calculated. This gives us two
compressed parameters for each pair of $f_c$ and $Q$. Each of these may have its own
unphysical degeneracies, but in order to give an unphysical mode of equal likelihood to
the true peak, these degeneracies will need to coincide. In Figure 3 we plot the
contours in $(f_c, Q)$ space of where
$\langle y_i\rangle(\theta; \theta_F) = \langle y_i\rangle(\theta_M; \theta_F)$ as
$\theta$ ranges over $f_c$ and $Q$ values. We can clearly see the degeneracies present
in either variable, but since these only overlap at the one location, near to where the
true peak is, there is no unphysical second mode in the MOPED likelihood. Hence, when
we plot the original and MOPED log-likelihoods in Figure 4, although the behaviour away
from the peak has changed, the peak itself remains in the same location and there is no
second mode.

Adding more parameters, however, does not always improve the situation. We now consider
the case where $Q$ is once again fixed to its true value and we instead make the
polarisation of the burst, $\psi$, a variable parameter. There are degeneracies in both
of these parameters and in Figure 5 we plot the contours in $(f_c, \psi)$-space where
the compressed values are each equal to the value at the maximum MOPED likelihood
point. These two will necessarily intersect at the maximum likelihood solution, near
the true value ($f_c = 0.1$ Hz and $\psi = 1.3$ rad), but a second intersection is also
apparent. This second intersection will have the same likelihood as the maximum and be
another mode of the solution. However, as we can see in Figure 6, in the left plot,
this is not a mode of the original likelihood function. MOPED has, in this case,
created a second mode of equal likelihood to the true peak.

For an even more extreme scenario, we now fix $\psi$ to the true value and allow the
time of arrival of the burst $t_0$ to vary (we also define
$\Delta t_0 = t_0 - t_{0,T}$). In this scenario, the contours in
$(f_c, \Delta t_0)$-space where
$\langle y_i\rangle(\theta; \theta_F) = \langle y_i\rangle(\theta_M; \theta_F)$ are much
more complicated. Thus, we have many more intersections of the two contours than just
at the likelihood peak near the true values and MOPED creates many alternative modes of
likelihood equal to that of the original peak. This is very problematic for parameter
estimation. In Figure 7 we plot these contours so the multiple intersections are
apparent. Figure 8 shows the original and MOPED log-likelihoods, where we can see the
single peak becoming a set of peaks.

Figure 3. Contours of
$\langle y_1\rangle(\theta; \theta_F) = \langle y_1\rangle(\theta_M; \theta_F)$ and
$\langle y_2\rangle(\theta; \theta_F) = \langle y_2\rangle(\theta_M; \theta_F)$ as they
vary over $f_c$ and $Q$. The one intersection is the true maximum likelihood peak.

Figure 4. Contours of the original and MOPED log-likelihoods (left and right,
respectively). The MOPED likelihood has been multiplied by a constant factor so that
its peak value is equal to the peak of the original likelihood. Contours are at 1, 2,
5, 10, 20, 30, 40, 50, 75, and 100 log-units below the peak going from the inside to
outside.

Figure 5. Contours of
$\langle y_1\rangle(\theta; \theta_F) = \langle y_1\rangle(\theta_M; \theta_F)$ and
$\langle y_2\rangle(\theta; \theta_F) = \langle y_2\rangle(\theta_M; \theta_F)$ values
as they vary as functions of $f_c$ and $\psi$.



Figure 6. Contours of the original and MOPED log-likelihoods (left and right,
respectively). The MOPED likelihood has been multiplied by a constant factor so that
its peak value is equal to the peak of the original likelihood. Contours are at 1, 2,
5, 10, 20, 30, 40, 50, 75, and 100 log-units below the peak going from the inside to
outside.



Figure 7. Contours of
$\langle y_1\rangle(\theta; \theta_F) = \langle y_1\rangle(\theta_M; \theta_F)$ and
$\langle y_2\rangle(\theta; \theta_F) = \langle y_2\rangle(\theta_M; \theta_F)$ values
as they vary as functions of $f_c$ and $t_0$. We can see many intersections here other
than the true peak.

Figure 8. Contours of the original and MOPED log-likelihoods (left and right,
respectively). The MOPED likelihood has been multiplied by a constant factor so that
its peak value is equal to the peak of the original likelihood. Contours are at 1, 2,
5, 10, 20, 30, 40, 50, 75, and 100 log-units below the peak going from the inside to
outside.

5 DISCUSSION AND CONCLUSIONS

What we can determine from the previous two sections is a general rule for when MOPED
will generate additional peaks in the likelihood function equal in magnitude to the
true one. For an $M$-dimensional model, if we consider the $(M-1)$-dimensional
hyper-surfaces where
$\langle y_i\rangle(\theta; \theta_F) = \langle y_i\rangle(\theta_M; \theta_F)$, then
any point where these $M$ hyper-surfaces intersect will yield a set of
$\theta$-parameter values with likelihood equal to that at the peak near the true
values. We expect that there will be at least one intersection at the likelihood peak
corresponding to approximately the true solution. However, we have shown that other
peaks can exist as well. The set of intersections of contour surfaces will determine
where these additional peaks are located. This degeneracy will interact with the
model's intrinsic degeneracy, as any degenerate parameters will yield the same
compressed values for different original parameter values.

Unfortunately, there is no simple way to find these contours other than by mapping out
the $\langle y_i\rangle(\theta; \theta_F)$ values, which is equivalent in procedure to
mapping the MOPED likelihood surface. The benefit comes when this procedure is
significantly faster than mapping the original likelihood surface. The mapping of
$\langle y_i\rangle(\theta; \theta_F)$ can even be performed before data is obtained or
used, if the fiducial model is chosen in advance; this allows us to analyse properties
of the MOPED compression before applying it to data analysis. If the MOPED likelihood
is mapped and there is only one contour intersection, then we can rely on the MOPED
algorithm and will have saved a considerable amount of time, since MOPED has been
demonstrated to provide speed-ups of orders of magnitude in Gupta & Heavens (2002).
However, if there are multiple intersections then it is necessary to map the original
likelihood to know if they are due to degeneracy in the model or were created
erroneously by MOPED. In this latter case, the time spent finding the MOPED likelihood
surface can be much less than that which will be needed to map the original likelihood,
so relatively little time will have been wasted. If any model degeneracies are known in
advance, then we can expect to see them in the MOPED likelihood and will not need to
find the original likelihood on their account.

One possible way of determining the validity of degenerate peaks in the MOPED
likelihood function is to compare the original likelihoods of the peak parameter values
with each other. By using the maximum MOPED likelihood point found in each mode and
evaluating the original likelihood at this point, we can determine which one is
correct. The correct peak and any degeneracy in the original likelihood function will
yield similar values to one another, but a false peak in the MOPED likelihood will have
a much lower value in the original likelihood and can be ruled out. This means that a
Bayesian evidence calculation cannot be obtained from using the MOPED likelihood;
however, the algorithm was not designed to be able to provide this.

The solution for this problem presented in Protopapas et al. (2005) is to use multiple
fiducial models to create multiple sets of weighting vectors. The log-likelihood is
then averaged across these choices. Each different fiducial will create a set of
likelihood peaks that include the true peaks and any extraneous ones. However, the only
peaks that will be consistent between fiducials are the correct ones. Therefore, the
averaging maintains the true peaks but decreases the likelihood at incorrect values.
This was tested with 20 random fiducials for the two-parameter models presented and was
found to leave only the true peak at the maximum likelihood value. Other, incorrect,
peaks are still present, but at log-likelihood values five or more units below the true
peak. When applied to the full seven parameter model, however, the SNR threshold for
signal recovery is increased significantly, from $\sim 10$ to $\sim 30$.

The MOPED algorithm for reducing the computational expense of likelihood functions can,
in some examples, be extremely useful and provide orders of magnitude of improvement.
However, as we have shown, this is not always the case and MOPED can produce erroneous
peaks in the likelihood that impede parameter estimation. It is important to identify
whether or not MOPED has accurately portrayed the likelihood function before using the
results it provides. Some solutions to this problem have been presented and tested.
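
The effect of averaging the MOPED log-likelihood over several fiducial models can be illustrated on a one-parameter toy problem. The sketch below is not the implementation of Protopapas et al. (2005), nor the one tested in this paper: the template (a Gaussian bump whose amplitude equals its centre), the unit white noise, the noiseless data, the three hand-picked fiducial values and all function names are assumptions made for illustration. A single fiducial at the truth produces a spurious peak of equal height; because each fiducial places its spurious peak somewhere different, the averaged log-likelihood retains only the true one at the maximum.

```python
import numpy as np

t = np.linspace(0.0, 10.0, 1001)       # sample times (toy units)
WIDTH = 0.5                             # fixed width of the toy bump


def signal(theta):
    """Toy template: Gaussian bump centred on theta with amplitude theta."""
    return theta * np.exp(-0.5 * ((t - theta) / WIDTH) ** 2)


def weight_vector(theta_f, eps=1e-6):
    """Single MOPED weighting vector for unit white noise."""
    deriv = (signal(theta_f + eps) - signal(theta_f - eps)) / (2.0 * eps)
    return deriv / np.sqrt(deriv @ deriv)


def moped_log_like_grid(x, theta_f, grid):
    """MOPED log-likelihood (up to a constant) on a grid, for one fiducial."""
    b1 = weight_vector(theta_f)
    y_data = b1 @ x
    y_model = np.array([b1 @ signal(th) for th in grid])
    return -0.5 * (y_data - y_model) ** 2


theta_true = 4.0
x = signal(theta_true)                  # noiseless data, to isolate the effect
grid = np.linspace(0.5, 9.5, 1801)

# One fiducial at the truth: a spurious peak appears on the far side.
single = moped_log_like_grid(x, 4.0, grid)

# Average over a few fiducials (values chosen by hand for this toy).
fiducials = [3.5, 4.0, 4.5]
averaged = np.mean([moped_log_like_grid(x, f, grid) for f in fiducials], axis=0)

far = grid > 5.0
theta_false = grid[far][np.argmax(single[far])]
i_false = np.argmin(np.abs(grid - theta_false))
print(f"spurious peak of the single-fiducial likelihood near {theta_false:.3f}")
print(f"single-fiducial log-likelihood there: {single[i_false]:.3e}")
print(f"averaged log-likelihood there:        {averaged[i_false]:.3e}")
print(f"maximum of the averaged likelihood at {grid[np.argmax(averaged)]:.3f}")
```

The average costs one set of weighting vectors and one compressed likelihood evaluation per fiducial, so the per-point cost grows linearly with the number of fiducials.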